37 research outputs found

    A large-scale dataset for end-to-end table recognition in the wild

    Table recognition (TR) is one of the research hotspots in pattern recognition; it aims to extract information from tables in an image. Common table recognition tasks include table detection (TD), table structure recognition (TSR) and table content recognition (TCR). TD locates tables in the image, TCR recognizes text content, and TSR recognizes the spatial and logical structure. Currently, end-to-end TR in real scenarios, which accomplishes the three sub-tasks simultaneously, is still an unexplored research area. One major factor that inhibits researchers is the lack of a benchmark dataset. To this end, we propose a new large-scale dataset named Table Recognition Set (TabRecSet) with diverse table forms sourced from multiple scenarios in the wild, providing complete annotation dedicated to end-to-end TR research. It is the largest and first bi-lingual dataset for end-to-end TR, with 38.1K tables, of which 20.4K are in English and 17.7K are in Chinese. The samples have diverse forms, such as border-complete and border-incomplete tables, and regular and irregular tables (rotated, distorted, etc.). The scenarios in the wild are varied, ranging from scanned to camera-taken images, documents to Excel tables, and educational test papers to financial invoices. The annotations are complete, consisting of the table body spatial annotation, cell spatial and logical annotation, and text content for TD, TSR and TCR, respectively. The spatial annotation uses the polygon instead of the bounding box or quadrilateral adopted by most datasets. The polygon spatial annotation is more suitable for irregular tables, which are common in wild scenarios. Additionally, we propose a visualized and interactive annotation tool named TableMe to improve the efficiency and quality of table annotation.
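The point about polygon annotation can be illustrated with a small geometric sketch. It is not taken from TabRecSet or TableMe: the helper names and the 400×200 table rotated by 20 degrees are assumptions chosen for illustration. It compares the area of a polygon outline with the area of the axis-aligned bounding box around the same table, showing how much background a bounding box pulls in once the table is rotated. A distorted table would simply need more polygon vertices than the four used here.

```python
import math


def polygon_area(points):
    """Area of a simple polygon [(x, y), ...] via the shoelace formula."""
    total = 0.0
    n = len(points)
    for i in range(n):
        x1, y1 = points[i]
        x2, y2 = points[(i + 1) % n]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2.0


def bounding_box_area(points):
    """Area of the axis-aligned bounding box enclosing the points."""
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    return (max(xs) - min(xs)) * (max(ys) - min(ys))


def rotated_rectangle(width, height, angle_deg):
    """Corners of a width x height rectangle rotated about the origin."""
    a = math.radians(angle_deg)
    cos_a, sin_a = math.cos(a), math.sin(a)
    corners = [(0.0, 0.0), (width, 0.0), (width, height), (0.0, height)]
    return [(x * cos_a - y * sin_a, x * sin_a + y * cos_a) for x, y in corners]


if __name__ == "__main__":
    # Hypothetical table outline: 400 x 200 pixels, rotated by 20 degrees.
    table = rotated_rectangle(400.0, 200.0, 20.0)
    poly = polygon_area(table)
    bbox = bounding_box_area(table)
    print(f"polygon area:      {poly:.1f}")
    print(f"bounding-box area: {bbox:.1f}")
    print(f"share of the box that is background: {1.0 - poly / bbox:.1%}")
```

The polygon area is unchanged by rotation, while the bounding box grows with the rotation angle, so a box-based label covers more non-table pixels as the table becomes more irregular.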

    Disentangling Writer and Character Styles for Handwriting Generation

    Training machines to synthesize diverse handwritings is an intriguing task. Recently, RNN-based methods have been proposed to generate stylized online Chinese characters. However, these methods mainly focus on capturing a person's overall writing style, neglecting subtle style inconsistencies between characters written by the same person. For example, while a person's handwriting typically exhibits general uniformity (e.g., glyph slant and aspect ratios), there are still small style variations in finer details (e.g., stroke length and curvature) of characters. In light of this, we propose to disentangle the style representations at both writer and character levels from individual handwritings to synthesize realistic stylized online handwritten characters. Specifically, we present the style-disentangled Transformer (SDT), which employs two complementary contrastive objectives to extract the style commonalities of reference samples and capture the detailed style patterns of each sample, respectively. Extensive experiments on various language scripts demonstrate the effectiveness of SDT. Notably, our empirical findings reveal that the two learned style representations provide information at different frequency magnitudes, underscoring the importance of separate style extraction. Our source code is public at: https://github.com/dailenson/SDT. Comment: accepted by CVPR 2023. Source code: https://github.com/dailenson/SD

    An Integrated Bioinformatics Approach Identifies Elevated Cyclin E2 Expression and E2F Activity as Distinct Features of Tamoxifen Resistant Breast Tumors

    Approximately half of estrogen receptor (ER) positive breast tumors will fail to respond to endocrine therapy. Here we used an integrative bioinformatics approach to analyze three gene expression profiling data sets from breast tumors in an attempt to uncover underlying mechanisms contributing to the development of resistance and potential therapeutic strategies to counteract these mechanisms. Genes that are differentially expressed in tamoxifen resistant vs. sensitive breast tumors were identified from three different publicly available microarray datasets. These differentially expressed (DE) genes were analyzed using gene function and gene set enrichment and examined in intrinsic subtypes of breast tumors. The Connectivity Map analysis was used to link gene expression profiles of tamoxifen resistant tumors to small molecules, and validation studies were carried out in a tamoxifen resistant cell line. Despite little overlap in genes that are differentially expressed in tamoxifen resistant vs. sensitive tumors, a high degree of functional similarity was observed among the three datasets. Tamoxifen resistant tumors displayed enriched expression of genes related to cell cycle and proliferation, as well as elevated activity of E2F transcription factors, and were highly correlated with a Luminal intrinsic subtype. A number of small molecules, including phenothiazines, were found that induced a gene signature in breast cancer cell lines opposite to that found in tamoxifen resistant vs. sensitive tumors, and the ability of phenothiazines to down-regulate cyclin E2 and inhibit proliferation of tamoxifen resistant breast cancer cells was validated. Our findings demonstrate that an integrated bioinformatics approach to analyzing gene expression profiles from multiple breast tumor datasets can identify important biological pathways and potentially novel therapeutic options for tamoxifen-resistant breast cancers.

    Perforated Thermal Mass Shading: An Approach to Winter Solar Shading and Energy, Shading and Daylighting Performance

    Direct solar irradiance may cause thermal discomfort, even in winter when the ambient temperature is low, and especially in high-altitude locations with a high intensity of solar radiation. Thus winter solar shading might be required and, if used, must achieve a balance between preventing the transmittance of solar irradiance, using passive solar heat and supplying adequate natural daylighting. These considerations render conventional solar shading solutions inapplicable in the winter. In this paper, a novel approach to perforated thermal mass shading for winter is reported and examined. The impacts of the perforated percentage and the opening positions of this shading device on energy, shading and daylighting performance were assessed for south- and west-facing orientations. A range of perforated percentages and vertical and horizontal positions were tested using simulations in EnergyPlus and Daysim. Our results indicate that the proposed perforated thermal mass shading is efficient for the integrated performance of shading, daylighting and energy savings in the south-facing orientation, while it achieves acceptable performance in shading and daylighting in the west-facing orientation for a high-altitude cold climate.

    Online Multikernel Learning Based on a Triple-Norm Regularizer for Semantic Image Classification

    Currently, image classifiers based on multikernel learning (MKL) mostly use a batch approach, which is slow and difficult to scale up for large datasets. In the meantime, the standard MKL model neglects the correlations among examples associated with a specific kernel, which makes it infeasible to adjust the kernel combination coefficients. To address these issues, a new and efficient multikernel multiclass algorithm called TripleReg-MKL is proposed in this work. Taking the principle of strong convex optimization into consideration, we propose a new triple-norm regularizer (TripleReg) to constrain the empirical loss objective function, which exploits the correlations among examples to tune the kernel weights. It highlights the application of multivariate hinge loss and a conservative updating strategy to filter noisy samples, thereby reducing the model complexity. This novel MKL formulation is then solved in an online mode using a primal-dual framework. A theoretical analysis of the complexity and convergence of TripleReg-MKL is presented. It shows that the new algorithm has a complexity of O(CMT) and achieves a fast convergence rate of O(log T / T). Extensive experiments on four benchmark datasets demonstrate the effectiveness and robustness of this new approach.

    Design of soil moisture sensor based on phase-frequency characteristics of RC networks

    Dielectric-based methods are widely used because they are non-destructive, efficient and accurate. The capacitance of the sensor probe is affected by the soil moisture, so a mathematical model can be built between the capacitance of the sensor and the soil moisture. In this paper, a new soil water content sensor based on the phase-frequency characteristic of an RC network is proposed. The sensor consists of four parts: a VHF oscillator, a phase-detecting circuit, a first-order RC low-pass circuit, and a probe. The VHF oscillator outputs a signal at a specified frequency f* to drive the RC network, and the capacitor C of the first-order RC low-pass network is replaced by the capacitance of the sensor probe. The change in probe capacitance caused by a change in soil moisture therefore produces a significant change in the phase-frequency response of the RC network. The AD8302 phase detector is used to measure this change by converting the phase angle of the RC network to a voltage signal. Thus, the relationship between the soil moisture content and the output voltage signal can be built to estimate water content in soil.
Compared with existing published work on the theoretical implementation, which yields low sensor accuracy and sensitivity, the proposed sensor is optimized in the following steps: 1) The measurement equivalent circuit model of the first-order RC low-pass circuit, together with the input equivalent circuit of the AD8302, is built; 2) The relationship between the output voltage signal of the AD8302 and the phase-frequency response of the measurement equivalent circuit, for a specified frequency f and resistor R of the RC network, is derived; 3) The optimization problem is formulated by maximizing the integral of the change in the AD8302 output voltage over the entire predefined variation range of the capacitor C of the RC circuit, 1×10⁻¹² F < C < 1×10⁻⁸ F, with f and R as the variables; 4) The objective function is solved by a Genetic Algorithm (GA) to obtain the optimal f* = 1.9412×10⁸ Hz and R* = 13.1 Ω, giving the sensor the highest sensitivity and accuracy in measuring the changes of C due to variations of the water content in soil. Experiments on the sensor are divided into two steps. First, the sensor is calibrated in a series of test solutions with different equivalent soil gravimetric water content, and the gravimetric water content prediction model is built as y = -79133x³ - 18141x² - 1418x + 0.5926, with a coefficient of determination R² = 0.9889. Second, the sensor is evaluated in soil samples with different gravimetric water content. The maximum and average prediction errors are 4.58% and 1.63%, respectively.
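The phase shift that the sensor relies on can be sketched numerically. The snippet below uses the textbook phase response of an ideal first-order RC low-pass filter, φ = -arctan(2πfRC), with the reported optimum f* and R*. It is a simplified illustration only: it assumes an ideal filter and ignores the AD8302 input equivalent circuit that the abstract's model includes, and the function name is hypothetical.

```python
import math

# Optimal operating point reported in the abstract.
F_STAR_HZ = 1.9412e8   # drive frequency f*
R_STAR_OHM = 13.1      # series resistance R*


def rc_lowpass_phase_deg(capacitance_f, freq_hz=F_STAR_HZ, resistance_ohm=R_STAR_OHM):
    """Phase of an ideal first-order RC low-pass output relative to its input.

    Assumption: ideal components, no loading by the phase detector.
    """
    omega = 2.0 * math.pi * freq_hz
    return -math.degrees(math.atan(omega * resistance_ohm * capacitance_f))


if __name__ == "__main__":
    # Sweep the probe capacitance across the stated range, 1e-12 F to 1e-8 F.
    for exponent in range(-12, -7):
        c = 10.0 ** exponent
        print(f"C = 1e{exponent} F -> phase = {rc_lowpass_phase_deg(c):8.3f} deg")
```

Under these assumptions the phase moves from close to 0 degrees at the low end of the capacitance range toward -90 degrees at the high end, which is the span a phase detector would convert into a voltage.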

    Effect of Preoperative Oral Carbohydrate Loading on Body Temperature during Combined Spinal-Epidural Anesthesia for Elective Cesarean Delivery

    BACKGROUND: Intraoperative hypothermia is a common complication after neuraxial block in cesarean delivery. At least 1 animal study has found that carbohydrate loading can maintain the body temperature of rats during general anesthesia, but it is unclear whether carbohydrate loading is beneficial for body temperature maintenance in parturient women during combined spinal-epidural anesthesia (CSEA) for elective cesarean delivery. METHODS: Women undergoing elective cesarean delivery were randomized into a control group (group C), an oral carbohydrate group (group OC), or an oral placebo group (group OP), with 40 women in each group. Core body temperature (Tc) and body surface temperature (Ts) before and after cesarean delivery, changes in Tc (ΔTc) and Ts (ΔTs), and the incidence of side effects (eg, intraoperative shivering) were compared among the groups. RESULTS: The postoperative Tc (core body temperature after cesarean delivery [Tc2]) of group OC (36.48 [0.48]°C) was higher than those of group C (35.95 [0.55]°C; P < .001) and group OP (36.03 [0.49]°C; P = .001). The ΔTc (0.30 [0.39]°C) in group OC was significantly smaller than those in group C (0.73 [0.40]°C; P = .001) and group OP (0.63 [0.46]°C; P = .0048). CONCLUSIONS: Oral carbohydrate loading 2 hours before surgery facilitated body temperature maintenance during CSEA for elective cesarean delivery.

    Potassium content prediction model of citrus leaves in different phenological period

    Based on reflectance spectra, a potassium (K) content prediction model was established to enable non-destructive testing of K content in citrus trees. Field experiments were conducted on 117 planted Luogang citrus trees in the Crab Village, and the data were collected on fresh and healthy citrus leaves in four dominant phenological periods. The ASD FieldSpec3 hyper-spectrometer and flame photometry were used to measure spectral reflectance data and K content, respectively. A series of experiments were conducted to analyze the sensitive frequency band of K content and the modeling regularity of prediction in different phenological periods. The results show that the K-content-sensitive band drifts in frequency across phenological periods. Compared with MLR, SVR and PLS, better prediction results can be obtained based on the K-content-sensitive frequency band. An R² of 0.994 and a mean square error of 0.120, with a mean relative error of 1.33%, are obtained with the SVR model on the validation set, which indicates that SVR can predict K content well over the whole growth period based on reflectance spectra, regardless of the frequency drift and the differing model performance.